DETAILED ACTION
Continued Examination Under 37 CFR 1.114 

1. A request for continued examination under 37 CFR 1.114, including the fee set
forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this 
application is eligible for continued examination under 37 CFR 1.114, and the fee set 
forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action
has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 
February 27, 2006 has been entered. 



Response to Arguments 

2. Applicant argues that the embodiments of the present invention provide for the 
use of not only semantic information, but also for pragmatic information contained in a 
reliably recognizable part of the speech phrase that is useful to explain another part of 
higher perplexity. Applicant argues that the prior art, individually or in combination, fails to
teach or suggest that pragmatic information contained in the at least one low-perplexity 
part with respect to at least one of the high-perplexity part may be used to explain the at 
least one of the high-perplexity part. However, applicant's arguments are not 
persuasive. 

Ehsani teaches that the operation of a voice-interactive application entails 
processing acoustic, syntactic, semantic, and pragmatic information derived from the 
user input in such a way as to generate a desired response from the application 
(column 11, paragraph 0216). Ehsani also teaches that if an n-gram is part of a larger
string collocation, the choice of words adjacent to the phrase boundary will be very
small, because of the internal constraint of the collocation. Conversely, the likelihood 
that a particular word will follow is very high. For example, the word following the 
trigram "to a large" will almost always be "extent" which means the perplexity is low,
and the trigram is subsumed under the fixed collocation "to a large extent." On the 
other hand, a large number of different words can precede or follow the phrase "to a 
large extent", and the probability that any particular word will follow is very small (close 
to 0), columns 5-6, paragraph 0102. 
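
The contrast drawn in the passage above can be illustrated numerically. The following is a minimal sketch and is not taken from Ehsani or from the application: the word counts are invented for illustration, and the function name next_word_perplexity is hypothetical. It computes the perplexity of a next-word distribution (2 raised to its entropy) for a context inside a collocation and for a context at the collocation boundary.

```python
import math
from collections import Counter


def next_word_perplexity(counts: Counter) -> float:
    """Perplexity (2 ** entropy) of a next-word distribution given raw counts."""
    total = sum(counts.values())
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in counts.values() if c > 0
    )
    return 2 ** entropy


# Hypothetical counts of words observed after the trigram "to a large".
inside_collocation = Counter({"extent": 98, "degree": 2})

# Hypothetical counts of words observed after the full phrase
# "to a large extent": fifty different followers, each equally likely.
after_collocation = Counter({f"word{i}": 1 for i in range(50)})

low = next_word_perplexity(inside_collocation)
high = next_word_perplexity(after_collocation)
print(f"inside collocation: {low:.2f}")
print(f"after collocation:  {high:.2f}")
```

Under these assumed counts the first value is close to 1 (the next word is almost fully determined) and the second is about 50 (one of fifty equally likely words), which is the low-perplexity versus high-perplexity distinction relied upon above.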

Claim Rejections - 35 USC § 103 

3. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

4. Claims 1-2, 4-5, 9, 14 and 18-21 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Jiang et al. (U.S. Patent No. 6,539,353), hereinafter referenced as 
Jiang, in view of Ehsani et al. (U.S. Publication No. 2002/0128821), hereinafter
referenced as Ehsani, and in further view of Kimura et al. (USPN 6,067,510), hereinafter
referenced as Kimura.

As per claims 1 and 14, Jiang discloses a method and apparatus for recognizing
speech, comprising: 

(a) the steps of receiving a speech phrase (100, FIG. 2); 

(b) generating a signal being representative of said speech phrase using A/D
converter (102, FIG. 2); 

(c) using a feature extractor for pre-processing and storing said signal (104, FIG. 2);

(d) generating from said pre-processed signal at least one series of hypothesis 
speech elements (Col. 1, lines 51-53);

(e) determining at least one series of words being most probable to correspond 
to said speech phrase by applying a predefined language model to at least said series 
of hypothesis speech elements (Col. 4, lines 13-16), 

wherein the step of determining said series of words further comprises the steps of:

(1) identifying a hypothesis string consisting of sub-word units (Col. 1, lines 52-55),
then continuing to determine words or combinations of words which are
consistent with said seed sub-phrase as at least a first successive sub-phrase which is
contained in said received speech phrase (Col. 6, lines 38-46 with Col. 5, lines 28-51
and Col. 4, lines 33-44). Jiang, however, lacks identifying and extracting word classes of
high perplexity, applying a compiler, merging the sub-word-unit grammars with the remaining
low-perplexity part and inserting additional information.

Ehsani discloses a phrase-based dialogue modeling method for producing a
low-perplexity recognition grammar from a conventional grammar having semantic
information including a description between sub-phrases (column 3, paragraphs 0034-0043)
comprising:

(a) identifying and extracting word classes (trigram subsumed under the fixed 
collocation) of high-perplexity (very high perplexity) from the conventional grammar 
(column 5, paragraphs 0100-0102); 

(b) generating a phonetic, phonemic and/or syllabic description (phone models
and phonetic dictionary; column 11, paragraph 0217) of high-perplexity word classes
(very high perplexity), in particular by applying a sub-word-unit grammar compiler to
them (column 11, paragraphs 0211-0214 with column 10, paragraphs 0199-0200), to
produce a sub-word-unit grammar for each high-perplexity word class (column 5,
paragraphs 0100-0102);

(c) merging sub-word-unit grammars (combining) with remaining low-perplexity 
part of the conventional grammar to yield said low-perplexity recognition grammar 
(column 4, paragraph 0064 with column 6, paragraph 0107), to measure the strength
of certain collocations; 

wherein said seed sub-phrase is recognized with an appropriate high degree of 
reliability, such that segments of speech that are recognized with high reliability are 
used to constrain the search in other areas of the speech signal where the language 
model employed cannot adequately restrict the search (column 3, paragraph 0059, 
column 5, paragraph 0100 and column 11, paragraph 0221),

wherein said at least one series of words ("to a large") substantially comprises at
least one low-perplexity part ("extent") which can be analyzed and recognized with a 
high degree of reliability, and remaining parts which are treated as high-perplexity parts 
(large number of different words; columns 5-6, paragraph 0102), and 

wherein pragmatic information contained in said at least one low-perplexity part 
(column 11, paragraph 0216) with respect to at least one of said high-perplexity parts 
may be used to explain said at least one of said high-perplexity parts (columns 5-6, 
paragraph 0102). 

Therefore, it would have been obvious to one of ordinary skill in the art at the
time the invention was made to modify Jiang's method so that it identifies and extracts
word classes of high perplexity, applies a compiler, merges the sub-word-unit grammars
with the remaining low-perplexity part, constrains the search, and provides pragmatic
information contained in a reliably recognizable part of the speech phrase that is useful
to explain another part of higher perplexity, in order to determine the average
branching factor of a recognition network for evaluating language models (column 5,
paragraph 0100) and to generate a desired response from the application (column 11,
paragraph 0216).
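
The idea of using a reliably recognized segment to constrain the search over a less reliable segment can be sketched as follows. This is an illustration under stated assumptions only and is not the method of Jiang, Ehsani, or the application: the lexicon STREETS_BY_CITY, the place names, the scores, and both function names are hypothetical.

```python
# Hypothetical lexicon: for each reliably recognizable seed (a city name),
# the set of street names that are consistent with it.
STREETS_BY_CITY = {
    "springfield": {"oak street", "main street"},
    "riverton": {"mill road", "main street"},
}


def constrain_hypotheses(seed, scored_hypotheses):
    """Keep only the high-perplexity hypotheses consistent with the seed."""
    allowed = STREETS_BY_CITY.get(seed, set())
    return [(word, score) for word, score in scored_hypotheses if word in allowed]


def best_hypothesis(seed, scored_hypotheses):
    """Return the best-scoring hypothesis that survives the constraint, if any."""
    candidates = constrain_hypotheses(seed, scored_hypotheses)
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[1])[0]


# Invented acoustic scores for the high-perplexity part of the utterance.
hypotheses = [("mill road", 0.41), ("oak street", 0.38), ("main street", 0.21)]

print(best_hypothesis("springfield", hypotheses))
```

In this sketch the highest-scoring hypothesis overall is discarded when the seed is "springfield", because it is not consistent with that seed, and the best remaining candidate is selected instead.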

Jiang in view of Ehsani discloses a method and apparatus for recognizing 
speech, but does not specifically teach inserting additional information. 

Kimura teaches inserting additional, higher order information (hierarchy), 
including semantic (semantic features), between the sub-phrases, thereby decreasing 
the burden of searching (greatly reduce labor and time to search; column 3, lines 43-51),
wherein the semantic information includes description of the sub-phrases (column
5, lines 38-56 with column 12, lines 22-26 and column 15, lines 36-43).

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Jiang in combination with Ehsani's method and 
apparatus such that it discloses inserting additional information, to sort and display 
words hierarchically in a particular order when displaying the words as candidates for 
substitution so that a time for retrieving the words can be reduced, as taught by Kimura 
(column 2, lines 1-6). 

As per claim 2, Jiang et al. disclose the use of a language model (110, FIG. 2) to 
provide additional information about the set of probabilities that a particular sequence of 
words will appear in the language of interest (Col. 4, lines 33-44).

As per claims 4 and 5, Jiang et al. disclose that the language model (110, FIG. 2)
is a compact trigram model that determines the probability of a sequence of words based
on the combined probabilities of three-word segments of the sequence (Col. 4, lines 41-44).
Inherently, trigram language models take prepositional relationships of sub-phrases
into account when calculating probabilities.
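
For illustration, the way a trigram model combines the probabilities of three-word segments can be sketched as below. This is a generic, unsmoothed maximum-likelihood trigram model and is not Jiang's compact trigram model; the toy corpus, the boundary markers, and the function names are assumptions made for this example.

```python
from collections import Counter


def train_trigram_counts(sentences):
    """Count trigrams and their two-word contexts, with boundary markers."""
    trigrams, contexts = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>", "<s>"] + sentence.split() + ["</s>"]
        for i in range(2, len(tokens)):
            contexts[(tokens[i - 2], tokens[i - 1])] += 1
            trigrams[(tokens[i - 2], tokens[i - 1], tokens[i])] += 1
    return trigrams, contexts


def sequence_probability(sentence, trigrams, contexts):
    """Product of P(w_i | w_{i-2}, w_{i-1}) over the sequence (no smoothing)."""
    tokens = ["<s>", "<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for i in range(2, len(tokens)):
        context = (tokens[i - 2], tokens[i - 1])
        if contexts[context] == 0:
            return 0.0  # unseen context: the unsmoothed estimate is zero
        prob *= trigrams[context + (tokens[i],)] / contexts[context]
    return prob


# Invented toy corpus, for illustration only.
corpus = ["call the office", "call the bank", "call the office now"]
tri, bi = train_trigram_counts(corpus)

print(sequence_probability("call the office", tri, bi))
print(sequence_probability("the call office", tri, bi))
```

A word order that never occurs in the toy corpus receives probability zero here; a practical model would apply smoothing or back-off, which this sketch omits.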

As per claim 9, Jiang et al. disclose the use of Hidden Markov Models for
estimating probabilities for any sequence of sub-words generated by lexicon (Col. 4, 
lines 23-30). 

As per claim 15, Jiang discloses a method and apparatus for recognizing 
speech, but does not specifically include information relating to grammatical constraints 
among said sub-seed. 

Ehsani discloses a speech recognition method and apparatus including
information relating to grammatical constraints among said sub-seed (column 11,
paragraph 0221), to narrow down the hypotheses generated by the acoustic signal.

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Jiang's method and apparatus wherein it 
includes information relating to grammatical constraints among said sub-seed, to narrow 
down the hypotheses generated by the acoustic signal, to come up with a number of 
possible commands that are processed by the system (column 11, paragraph 0221).

As per claim 16, Jiang discloses a method and apparatus for recognizing 
speech, but does not specifically include grammatical constraints for a name of a city. 

Ehsani discloses a speech recognition method and apparatus including 
grammatical constraints for a name of a city (column 10, paragraph 0196), to enable the 
phrase thesaurus to be represented more compactly. 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Jiang's method and apparatus wherein it 
includes grammatical constraints for a name of a city, to enable the phrase thesaurus to 
be represented more compactly thus decreasing the data storage capacity required to 
store the data representing the phrase thesaurus (column 10, paragraph 0197). 

As per claim 17, Jiang discloses a method and apparatus for recognizing speech,
but does not specifically disclose pragmatic information including a digital postal code for
the city.

Ehsani teaches that the descriptors include businesses, restaurants, cities, etc.
(column 10, paragraph 0196), to enable the phrase thesaurus to be represented more
compactly.

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Jiang's method and apparatus such that it 
includes a 5-digit postal code for the city, to allow the information to be received 
hierarchically with a large variety of different domains (column 2, paragraph 0022). 

As per claims 18 and 20, Jiang discloses the method and apparatus for
recognizing speech, but lacks the feature wherein said seed sub-phrase recognized with an
appropriate high degree of reliability is defined as a low perplexity part of said received 
speech phrase. 

Ehsani discloses the method wherein said seed sub-phrase recognized with an
appropriate high degree of reliability is defined as a low perplexity part of said received
speech phrase (column 3, paragraphs 0034-0043 with column 4, paragraph 0064 and
column 6, paragraph 0107), to measure the strength of certain collocations.

Therefore, it would have been obvious to one of ordinary skill in the art at the time the
invention was made to modify Jiang's method and apparatus wherein said seed sub-phrase
recognized with an appropriate high degree of reliability is defined as a low
perplexity part of said received speech phrase, as taught by Ehsani, in order to
determine the average branching factor of a recognition network for evaluating
language models (column 5, paragraph 0100).

As per claims 19 and 21, Jiang discloses the method wherein perplexity is
defined as the complexity of the depth of search which has to be accomplished in 
conventional search graphs or search trees (column 4, lines 45-57). 

5. Claims 6-7 and 10-12 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Jiang in view of Ehsani and Kimura, as applied to claims 1 and 14 
above, and in further view of Chou et al. (U.S. Patent No. 5,797,123), hereinafter 
referenced as Chou. 

As per claims 6 and 7, Jiang in view of Ehsani and Kimura does not disclose the
use of low-perplexity and high-perplexity parts in the system.

Chou teaches limited vocabulary word spotting (low perplexity) with a parallel 
network of subword models used to model the non-keyword portions of the input 
utterance (high-perplexity) (Col. 2, lines 61-65). Inherently, sub-word models contain 
word fragments. 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Jiang in combination with Ehsani and Kimura's 
method and apparatus, as taught in Chou, in order to improve the speed of recognition 
by quickly identifying commonly-used words using low-perplexity vocabulary and then 
proceeding to identify the less-common words by resorting to more expansive 
computations. 

As per claim 10, Jiang in view of Ehsani and Kimura does not disclose the
insertion of high-perplexity word classes into a hypothetic graph.

Chou teaches the insertion of functional words and filler phrases into the 
detection network to improve recognition of key-phrases (Col. 6, lines 47-56). 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Jiang in view of Ehsani and Kimura's method 
and apparatus, as taught in Chou, in order to handle repeating speech patterns and 
thus speed up the search and improve recognition. 

As per claim 11, Jiang in view of Ehsani and Kimura does not disclose the removal
of candidates from the hypothetical graph. 

Chou teaches the merging of the states of the key-phrase network, thus reducing 
its size (Col. 7, lines 40-46). 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Jiang in combination with Ehsani and Kimura's
method and apparatus, as taught in Chou, in order to prune the passed nodes while 
doing the search through the hypothetical network and thus limit the possibility to 
accidentally encroach upon the beginning of another phrase. 

As per claim 12, Jiang in view of Ehsani and Kimura does not disclose restricting
the remaining part of the key-phrase. 

Chou teaches placing additional constraints on the search that inhibit impossible 
connections of key-phrases (Col. 6, lines 64-65). 

Therefore, it would have been obvious to one of ordinary skill in the art at the
time the invention was made to modify Jiang in combination with Ehsani and Kimura's
method and apparatus, as taught in Chou, in order to improve the speed of recognition 
by quickly removing impossible combinations from the search graph and thus limiting 
the search space. 
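
The effect of removing impossible connections from a search graph can be sketched as follows. This is an illustration only and is not Chou's detection network: the graph, the node labels, the forbidden connection, and both function names are hypothetical. The sketch counts the paths through a small acyclic graph before and after one connection is inhibited.

```python
def prune_connections(edges, forbidden):
    """Return a copy of the graph without the forbidden (source, target) edges."""
    return {
        src: [dst for dst in targets if (src, dst) not in forbidden]
        for src, targets in edges.items()
    }


def count_paths(edges, start, end):
    """Count distinct paths from start to end in an acyclic graph."""
    if start == end:
        return 1
    return sum(count_paths(edges, nxt, end) for nxt in edges.get(start, []))


# Hypothetical acyclic phrase graph.
edges = {
    "start": ["depart", "arrive"],
    "depart": ["morning", "evening"],
    "arrive": ["morning", "evening"],
    "morning": ["end"],
    "evening": ["end"],
}

# Hypothetical constraint: this connection is treated as impossible.
forbidden = {("depart", "evening")}
pruned = prune_connections(edges, forbidden)

print(count_paths(edges, "start", "end"))
print(count_paths(pruned, "start", "end"))
```

Each inhibited connection removes every path that passes through it, so the number of candidates the search must evaluate shrinks accordingly.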

Conclusion 

6. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Jakieda R. Jackson whose telephone number is 
571.272.7619. The examiner can normally be reached on Monday through Friday from 
7:30 a.m. to 5:00 p.m.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, David Hudspeth, can be reached on 571.272.7843. The fax phone number
for the organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 
JRJ April 14, 2006